Added Configurable Delay option #268

Open
wants to merge 1 commit into
base: master
from

Conversation

sww1235

@sww1235 sww1235 commented Nov 16, 2023

This helps avoid the rate-limiting introduced by archive.org

Alternative to #266 that allows for a configurable delay rather than a hardcoded one. It also introduces the same delay when fetching snapshots.

Closes #267, maybe #264, #244 and #246

Reference to archive.org implementing rate limiting: https://archive.org/details/toomanyrequests_20191110

@sww1235
Author

sww1235 commented Nov 16, 2023

I was unable to run rake on this due to an error with rake itself. I don't think I broke any of the tests, but let me know if you need anything fixed.

@sww1235
Author

sww1235 commented Nov 16, 2023

I picked -n as the short form of the delay option, based on the Linux nice command. The other good options were already in use. Feel free to change this if desired.
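For readers following along, here is a minimal sketch of how a configurable delay flag could be wired up with Ruby's standard `OptionParser`. It is not the code from this PR: the method name `parse_delay_options` and the long form `--delay` are assumptions made for illustration, while `-n` and the 4-second default come from this thread.

```ruby
require 'optparse'

# Hypothetical sketch, not the PR's actual code.
# Assumed names: parse_delay_options, the long flag --delay.
def parse_delay_options(argv)
  options = { delay: 4.0 } # default of 4 seconds, as mentioned in the thread

  parser = OptionParser.new do |opts|
    opts.banner = 'Usage: downloader [options] URL'
    opts.on('-n', '--delay SECONDS', Float,
            'Seconds to wait between requests (default: 4)') do |seconds|
      raise OptionParser::InvalidArgument, 'delay must be non-negative' if seconds.negative?

      options[:delay] = seconds
    end
  end

  # parse (without a bang) leaves argv untouched and returns the non-option arguments.
  remaining = parser.parse(argv)
  [options, remaining]
end
```

Calling `parse_delay_options(['-n', '2.5', 'http://example.com'])` would yield a delay of 2.5 seconds and leave the URL in the remaining arguments.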

This was referenced Nov 20, 2023
@lcorbasson

@sww1235 it would be nice to update README.md too with your new config option.

@MatthewTingum

I think it would be nice to include a message on "connection refused" too: "Connection was refused. You may be rate limited. Try increasing the rate-limit value. See: github.com/foo/bar."
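A rough sketch of what such a message could look like in Ruby, assuming the refused connection surfaces as `Errno::ECONNREFUSED`. The helper name `with_rate_limit_hint` and the constant `RATE_LIMIT_HINT` are made up for this example and are not part of the project.

```ruby
# Hypothetical sketch: the names below are illustrative, not the project's.
RATE_LIMIT_HINT = 'Connection was refused. You may be rate limited. ' \
                  'Try increasing the rate-limit value.'

# Runs the given block. If the connection is refused, prints a hint to
# `out` and returns nil instead of letting the exception propagate.
def with_rate_limit_hint(out = $stderr)
  yield
rescue Errno::ECONNREFUSED => e
  out.puts "#{RATE_LIMIT_HINT} (#{e.message})"
  nil
end
```

Any request could then be wrapped as `with_rate_limit_hint { ... }`, with the block containing whatever call performs the download.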

@Theta-Dev

Theta-Dev commented Dec 2, 2023

Note: you should add the download delay after the check for whether the file already exists (this line):
https://github.com/hartator/wayback-machine-downloader/pull/268/files#diff-012e3d978c45d5eff042c16d88ed89dd9e302c0d3fa43df46a87f82f957fafacL266

Otherwise you will be waiting a long time if you are resuming a partial download.

There should also be a configurable number of retry attempts per file.
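To illustrate both suggestions, here is a minimal sketch that checks for an existing file before sleeping and retries a bounded number of times. Everything in it is an assumption for illustration (the method `download_with_delay`, its keyword arguments, and the injected `fetch` and `sleeper` callables); it does not reflect how the project structures its download loop.

```ruby
# Hypothetical sketch: every name here is illustrative, not the project's.
# fetch   - callable that performs the request and returns the body
# sleeper - callable that waits; injectable so the delay can be observed
def download_with_delay(path, delay:, max_retries:, fetch:, sleeper: Kernel.method(:sleep))
  # Check first, so files already on disk cost no waiting at all.
  return :skipped if File.exist?(path)

  attempts = 0
  begin
    attempts += 1
    sleeper.call(delay) # wait only when a request is about to be made
    File.write(path, fetch.call)
    :downloaded
  rescue StandardError
    # One initial attempt plus up to max_retries further attempts.
    retry if attempts <= max_retries
    :failed
  end
end
```

With this ordering, resuming a partial download skips existing files immediately, and the delay is paid once per actual request, including each retry.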

@JomSpoons

Great fork, has completely solved the issues I was having before. Thank you.

@Forage

Forage commented Jan 10, 2024

@sww1235 you are referring to the submit rate limit implemented quite some time ago. It would be good to know what the actual download rate limit is, to make sure the default of 4 seconds is actually a sane delay.

With #267 (comment) it might be that this work-around is no longer essential, but it would be good to have either way.

As for the parameter naming, what about --download-interval (or --interval if you insist on a short name) and -i, to have more self-explanatory names?

@MWigginsIII

Hello, I'm not a coder, just a regular guy who ran into this problem. Is there anything I can read or watch on how to fix it? I see your fixes, but how do I apply them? Thanks

@hlorofos

The delay is actually applied not between downloads/requests, but between files being processed.
For example, I'm resuming a download that has a lot of already-existing files, and the delay is applied even though no request to the service is made for those files.

Development

Successfully merging this pull request may close these issues.

Download fails
8 participants